Introduction to FEATURE ENGINEERING & DATA PREPARATION

  • Dealing with Outliers
  • Dealing with Missing Data
    1. Evaluation of Missing data
    2. Filling and dropping data based on rows
    3. Fixing data based on columns
  • Dealing with Categorical Data - Encoding Options

Feature Engineering


Encoding Options for Categorical Data


Issue with Integer Encoding

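A minimal sketch of integer encoding, assuming pandas and a hypothetical `country` column (neither the data nor the column names come from the original material). The concern usually raised with this encoding is that the integer codes suggest an order and a distance between categories that may not exist in the data.

```python
import pandas as pd

# Hypothetical example data, for illustration only.
df = pd.DataFrame({"country": ["USA", "MEX", "CANADA", "USA"]})

# Convert to a categorical dtype; pandas sorts the categories alphabetically
# and assigns each one an integer code.
as_category = df["country"].astype("category")
df["country_code"] = as_category.cat.codes

# Mapping from integer code back to the original label.
mapping = dict(enumerate(as_category.cat.categories))

print(df)
print(mapping)
```

Here USA receives a larger code than CANADA, so a model that treats `country_code` as numeric could read that as USA being "greater than" CANADA, even though the categories have no natural order.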

Pros and Cons of Integer Encoding


One Hot Encoding - (Dummy Variables)


This can be extended to more than two categories.


  • With three categories, only 2 columns are created (USA and MEX). A CANADA row is identified implicitly, because both USA and MEX are 0 for it.
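The two-column encoding described above can be sketched with `pandas.get_dummies`, assuming a hypothetical `country` column (the data below is made up for illustration). With `drop_first=True`, pandas drops the first level in sorted order, which here happens to be CANADA.

```python
import pandas as pd

# Hypothetical example data, for illustration only.
df = pd.DataFrame({"country": ["USA", "MEX", "CANADA", "USA"]})

# One column per category, minus the first level in sorted order (CANADA).
# dtype=int yields 0/1 values instead of booleans.
dummies = pd.get_dummies(df["country"], drop_first=True, dtype=int)

print(dummies)
```

The CANADA row has 0 in both the MEX and USA columns, so no information is lost by dropping the third column.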

Pros and Cons of One Hot Encoding


End of the Session